Create Tune API in the Katib SDK #1951

andreyvelich · 2022-09-13T00:02:50Z

Fixes: #1585.
Inspired by KFP create_component_from_func.

This API allows users to create a Katib HP tuning Experiment without writing the Experiment YAML.
This is the first small step toward simplifying our Kubeflow SDKs and hiding Kubernetes complexity.

Later we can extend this functionality (supporting more Katib features in a simple way) and expose more spec options via the APIs.

We can also provide an API (e.g. katib.list_base_images) to show the list of supported images (e.g. BASE_IMAGE_TENSORFLOW, BASE_IMAGE_MXNET) with their descriptions.

cc @kubeflow/wg-training-leads @tenzen-y @anencore94 Please give your feedback on the API design.
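For readers unfamiliar with the "create from function" pattern mentioned in the description, here is a minimal sketch of the general idea: take the source of a user's objective function and turn it into a standalone script that a trial container could run. This is an illustration under assumptions, not the PR's actual implementation; `build_trial_script` and `script_from_function` are hypothetical names.

```python
import inspect
import textwrap
from typing import Callable, Dict


def build_trial_script(func_source: str, func_name: str, parameters: Dict[str, str]) -> str:
    """Return a standalone script that defines the function and calls it once."""
    # Assumption: trial parameters are handed over as text, so values are strings here.
    args = ", ".join(f"{key!r}: {value!r}" for key, value in parameters.items())
    return f"{textwrap.dedent(func_source).strip()}\n\n{func_name}({{{args}}})\n"


def script_from_function(func: Callable, parameters: Dict[str, str]) -> str:
    # inspect.getsource only works for functions defined in a file or notebook cell.
    return build_trial_script(inspect.getsource(func), func.__name__, parameters)
```

Because only the function body travels to the trial, anything the function needs (including its imports) has to live inside the function itself.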

review-notebook-app · 2022-09-13T00:02:54Z

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

andreyvelich · 2022-09-13T00:09:14Z

/hold
Hold until kubeflow/training-operator#1659 is merged.

coveralls · 2022-09-13T00:47:47Z

Coverage remained the same at 73.431% when pulling bce47b9 on andreyvelich:sdk-create-from-func into b1e912c on kubeflow:master.

tenzen-y · 2022-09-13T14:52:19Z

Thanks for implementing the new feature!
First, I'll review the PR submitted in the training-operator repo.

anencore94

Thanks for the awesome feature! Users can generate Katib experiments much more easily now :)
I left some personal opinions.

anencore94 · 2022-09-14T14:20:56Z

README.md

+    # Import required packages.
+    import time


I understand why this kind of logic is necessary, but we might find a friendlier way for users to specify packages.
However, as a minimal feature, this is OK as long as we instruct users not to omit required packages from the objective function.

I agree, but I think Kubeflow Pipelines follows the same approach: https://www.kubeflow.org/docs/components/pipelines/sdk/python-function-components/#building-python-function-based-components
@zijianjoy Do you know if the Pipelines SDK verifies that all function imports are defined properly in a Python lightweight component?

@anencore94 Any ideas on how to verify this?
cc @tenzen-y @gaocegege @johnugeorge

We may need to validate this automatically after this PR. How about creating an issue to track this feature now?
If we can come up with a general approach for this validation, we may be able to apply it to other Kubeflow components such as Pipelines and the Training Operator.

Sure, that sounds good @anencore94.
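One possible direction for the validation discussed above, as a rough heuristic sketch rather than anything the SDK is confirmed to do: parse the objective function's source with `ast` and report names it reads but never imports, assigns, or receives as arguments. `find_unresolved_names` is a hypothetical helper; it ignores some binding forms (for example `except ... as e`), so it can produce false positives.

```python
import ast
import builtins
import textwrap


def find_unresolved_names(func_source: str) -> set:
    """Return names the function reads but does not bind inside its own body."""
    tree = ast.parse(textwrap.dedent(func_source))
    func = tree.body[0]
    if not isinstance(func, ast.FunctionDef):
        raise ValueError("expected a single function definition")

    # Names bound locally: the function's arguments to start with.
    args = func.args
    bound = {a.arg for a in args.posonlyargs + args.args + args.kwonlyargs}
    if args.vararg:
        bound.add(args.vararg.arg)
    if args.kwarg:
        bound.add(args.kwarg.arg)

    used = set()
    for node in ast.walk(func):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            # "import a.b" binds "a"; "import a as x" binds "x".
            for alias in node.names:
                bound.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, (ast.FunctionDef, ast.ClassDef)) and node is not func:
            bound.add(node.name)
        elif isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Load):
                used.add(node.id)
            else:
                bound.add(node.id)

    # Builtins such as print or int are always available, so skip them.
    return {name for name in used - bound if not hasattr(builtins, name)}
```

A check like this could run on the client before the Experiment is created and raise an error that lists the missing imports.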

anencore94 · 2022-09-14T14:29:19Z

sdk/python/v1beta1/kubeflow/katib/api/katib_client.py


        return result
+
+
+def double(min: float, max: float, step: float = None):


How about gathering these hyperparameter helper functions into a separate inner module?
For example, users could call these functions like this:

    import kubeflow.katib as katib

    hp_search_space = {
        "a": katib.hp_search_space.int(min=10, max=20),
        "b": katib.hp_search_space.double(min=0.1, max=0.2)
    }

    katib_client.tune(
        name=name,
        objective=objective,
        hp_search_space=hp_search_space,
        objective_metric_name="result",
        max_trial_count=12
    )

IMO, calling it as katib.int is too broad: it is not immediately clear that it defines a hyperparameter search space type.

I was even thinking of changing this API in the future to support various distributions: uniform, loguniform, categorical, etc.

FYI, hyperopt uses the hp module, skopt uses the space module, Optuna uses the distributions module, and Ray uses the tune module. I think we can learn something from them.

A search module (e.g. katib.search.int(1,5)) looks good to me, but I am not sure what the best UX is here.

What do others think @terrytangyuan @gaocegege @johnugeorge @c-bata @a9p @g-votte @tenzen-y ?

Ref #1951 (comment).

IMO it feels more natural if we could support built-in lists or common numpy arrays, etc. Is there a particular motivation for bringing a separate module in the Katib SDK just to do the same thing?

@terrytangyuan Lists and numpy arrays might not be enough.
How can we identify the search space for uniform or loguniform in the future?

E.g. Hyperopt uses some numpy APIs, but it still has its own functions for the search space: https://github.com/hyperopt/hyperopt/blob/master/hyperopt/pyll/stochastic.py#L41-L43

The functions in that module seem to be very thin wrappers. If we could completely avoid creating additional wrappers/layers, it would be great.

I agree with @andreyvelich.
In this PR, we can start with the minimal feature and support other distributions in follow-up PRs.

@terrytangyuan Do you have any ideas on how we can improve the UX here?

I don't have a strong preference. I am ok as is if other projects are following the same convention.
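To make the namespacing discussion concrete, here is a minimal sketch of what a dedicated search namespace could look like. It is an illustration only: `ParameterSpec` is a hypothetical stand-in for the SDK's parameter model, and storing bounds as strings is an assumption made for this example.

```python
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Optional


@dataclass
class ParameterSpec:
    """Hypothetical stand-in for the SDK's parameter model."""
    parameter_type: str
    min: Optional[str] = None
    max: Optional[str] = None
    step: Optional[str] = None


def _as_str(value):
    # Keep "not set" as None instead of the string "None".
    return None if value is None else str(value)


def _int(min: int, max: int, step: Optional[int] = None) -> ParameterSpec:
    return ParameterSpec("int", _as_str(min), _as_str(max), _as_str(step))


def _double(min: float, max: float, step: Optional[float] = None) -> ParameterSpec:
    return ParameterSpec("double", _as_str(min), _as_str(max), _as_str(step))


# Grouping the helpers makes the call site read as "search.int", not a bare "int".
search = SimpleNamespace(int=_int, double=_double)

hp_search_space = {
    "a": search.int(min=10, max=20),
    "b": search.double(min=0.1, max=0.2),
}
```

Keeping the helpers behind one namespace also leaves room to add distribution-specific functions (uniform, loguniform) later without touching the top-level package.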

sdk/python/v1beta1/kubeflow/katib/constants/constants.py

sdk/python/v1beta1/kubeflow/katib/api/katib_client.py

anencore94 · 2022-09-14T14:40:43Z

sdk/python/v1beta1/kubeflow/katib/api/katib_client.py

+        algorithm_name: str = "random",
+        objective_metric_name: str = None,
+        additional_metric_names: List[str] = [],
+        objective_type: str = "maximize",


Maybe we could define "maximize" and "minimize" as an Enum class or a Literal.

I was also thinking about it.
@anencore94 Is there a way to automatically generate enums for our Python models?
In our APIs, the type param is an enum: https://github.com/kubeflow/katib/blob/master/pkg/apis/controller/common/v1beta1/common_types.go#L96

By "automatically", do you mean generating the Python Enum class from the Go file or openapi.json?

Exactly. I am not sure if kube-openapi supports enum generation. Check this: kubernetes/enhancements#2887.
Maybe we should update kube-openapi version for this feature.

Sadly, I don't have much experience with it. How about creating an issue for this feature and merging this PR with hardcoded values for now?

Sounds good @anencore94 !
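For reference, the two hand-written options mentioned above could look roughly like this. This is a sketch with hypothetical names (`ObjectiveType`, `ObjectiveTypeEnum`, `check_objective_type`), not generated code and not part of the SDK.

```python
from enum import Enum
from typing import Literal, get_args

# Option 1: a Literal alias, checked statically by type checkers.
ObjectiveType = Literal["maximize", "minimize"]


# Option 2: a str-based Enum, which still compares equal to the plain strings.
class ObjectiveTypeEnum(str, Enum):
    MAXIMIZE = "maximize"
    MINIMIZE = "minimize"


def check_objective_type(value: str) -> str:
    """Runtime check for callers that pass an arbitrary string."""
    allowed = get_args(ObjectiveType)
    if value not in allowed:
        raise ValueError(f"objective_type must be one of {allowed}, got {value!r}")
    return value
```

The Literal keeps the public signature as a plain `str`, while the Enum gives users autocompletion; either could later be replaced by generated types.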

sdk/python/v1beta1/kubeflow/katib/api/katib_client.py

sdk/python/v1beta1/kubeflow/katib/constants/constants.py

README.md

Modify packages_to_install doc Create validate objective function

tenzen-y

@andreyvelich Thanks for your contributions!
I left a few comments.

sdk/python/v1beta1/setup.py

sdk/python/v1beta1/kubeflow/katib/constants/constants.py

Change k8s version package

andreyvelich · 2022-09-30T14:43:54Z

Thanks everyone for review!
I addressed the remaining comments and created a separate search module for HPs.
Please take a look @anencore94 @tenzen-y @johnugeorge @terrytangyuan

tenzen-y

@andreyvelich Thanks for updating this!

sdk/python/v1beta1/kubeflow/katib/api/search.py

README.md

Co-authored-by: Yuki Iwai <yuki.iwai.tz@gmail.com>

tenzen-y · 2022-09-30T16:32:36Z

Thanks for addressing my comments!
/lgtm

johnugeorge · 2022-10-04T06:03:37Z

/approve

google-oss-prow · 2022-10-04T06:03:54Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andreyvelich, johnugeorge

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [andreyvelich,johnugeorge]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

johnugeorge · 2022-10-04T06:03:59Z

Thanks @andreyvelich for this great feature

andreyvelich · 2022-10-04T21:18:35Z

Thanks everyone for your feedback and review!
I am looking forward to seeing the new Katib usage.
/hold cancel

terrytangyuan

This is great!

/lgtm

Create Tune API in the Katib SDK

86d3a68

google-oss-prow bot requested review from sperlingxx and tenzen-y September 13, 2022 00:02

google-oss-prow bot added approved size/XXL labels Sep 13, 2022

google-oss-prow bot added the do-not-merge/hold label Sep 13, 2022

andreyvelich requested a review from a team September 14, 2022 11:33

anencore94 reviewed Sep 14, 2022

tenzen-y mentioned this pull request Sep 14, 2022

Create TFJob and PyTorchJob from Function APIs in the Training SDK kubeflow/training-operator#1659

Merged

terrytangyuan reviewed Sep 15, 2022

sdk/python/v1beta1/kubeflow/katib/constants/constants.py (outdated, resolved)

terrytangyuan reviewed Sep 15, 2022

README.md (outdated, resolved)

Add Final to consts

1b77b24

Modify packages_to_install doc Create validate objective function

tenzen-y reviewed Sep 16, 2022

sdk/python/v1beta1/setup.py (outdated, resolved)

sdk/python/v1beta1/kubeflow/katib/constants/constants.py (resolved)

andreyvelich added 2 commits September 16, 2022 22:22

Add GPU TF Image

816867b

Change k8s version package

Create search module

316d4a4

tenzen-y reviewed Sep 30, 2022

sdk/python/v1beta1/kubeflow/katib/api/search.py (outdated, resolved)

tenzen-y reviewed Sep 30, 2022

README.md (outdated, resolved)

andreyvelich and others added 2 commits September 30, 2022 17:29

Fix link in README

0dd7733

Co-authored-by: Yuki Iwai <yuki.iwai.tz@gmail.com>

Fix licence date

bce47b9

google-oss-prow bot assigned tenzen-y Sep 30, 2022

google-oss-prow bot added the lgtm label Sep 30, 2022

This was referenced Oct 4, 2022

[SDK] Improve Validation for Objective Function #1968

Open

[SDK] Generate Enum Types for Katib APIs #1969

Open

google-oss-prow bot removed the do-not-merge/hold label Oct 4, 2022

google-oss-prow bot merged commit 96ab64b into kubeflow:master Oct 4, 2022

terrytangyuan reviewed Oct 4, 2022

andreyvelich deleted the sdk-create-from-func branch October 4, 2022 21:24

andreyvelich mentioned this pull request Nov 8, 2022

Katib v0.15.0 Roadmap #1993

Closed

13 tasks

andreyvelich mentioned this pull request Jan 27, 2023

Fix Release Script for Updating SDK Version #2104

Merged
